scheduling policy
Learning to Decode: Reinforcement Learning for Decoding of Sparse Graph-Based Channel Codes
We show in this work that reinforcement learning can be successfully applied to decoding short-to-moderate-length sparse graph-based channel codes. Specifically, we focus on low-density parity-check (LDPC) codes, which, for example, have been standardized for 5G cellular communication systems due to their excellent error-correcting performance. These codes are typically decoded via iterative belief propagation on the code's bipartite (Tanner) graph using a flooding schedule, i.e., all check and variable nodes in the Tanner graph are updated at once. In contrast, in this paper we utilize a sequential update policy which selects an optimized check node (CN) schedule in order to improve decoding performance. In particular, we model the CN update process as a multi-armed bandit process with dependent arms and employ a Q-learning scheme for optimizing the CN scheduling policy. To reduce the learning complexity, we propose a novel graph-induced CN clustering approach that partitions the state space in such a way that dependencies between clusters are minimized. Our results show that, compared to other decoding approaches from the literature, the proposed reinforcement learning scheme not only significantly improves the decoding performance, but also reduces the decoding complexity dramatically once the scheduling policy is learned.
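To make the bandit-style formulation above concrete, here is a minimal, self-contained sketch of tabular Q-learning over check-node selection. It is a generic illustration, not the authors' implementation: the `decode_step` environment, its state transition, and its reward (e.g., the reduction in unsatisfied parity checks) are hypothetical stand-ins for an actual LDPC decoder.

```python
import random
from collections import defaultdict

# Minimal tabular Q-learning over check-node (CN) selection.
# `decode_step` is a hypothetical stub: in a real decoder it would apply one
# CN update and return (next_state, reward, done), with reward derived, e.g.,
# from the decrease in the number of unsatisfied parity checks.

NUM_CNS = 8          # number of check nodes (arms)
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1

def decode_step(state, cn):
    """Stub environment: replace with an actual LDPC decoder update."""
    next_state = (state + cn) % 16            # placeholder state transition
    reward = random.random() - 0.5            # placeholder reward signal
    done = random.random() < 0.05             # placeholder termination
    return next_state, reward, done

Q = defaultdict(lambda: [0.0] * NUM_CNS)

for episode in range(200):
    state, done = 0, False
    while not done:
        # epsilon-greedy CN selection
        if random.random() < EPS:
            cn = random.randrange(NUM_CNS)
        else:
            cn = max(range(NUM_CNS), key=lambda a: Q[state][a])
        next_state, reward, done = decode_step(state, cn)
        # standard Q-learning update
        target = reward + (0.0 if done else GAMMA * max(Q[next_state]))
        Q[state][cn] += ALPHA * (target - Q[state][cn])
        state = next_state

greedy_schedule = {s: max(range(NUM_CNS), key=lambda a: Q[s][a]) for s in Q}
print(greedy_schedule)
```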
The Silence that Speaks: Neural Estimation via Communication Gaps
Aggarwal, Shubham, Maity, Dipankar, Başar, Tamer
Accurate remote state estimation is a fundamental component of many autonomous and networked dynamical systems, where multiple decision-making agents interact and communicate over shared, bandwidth-constrained channels. These communication constraints introduce an additional layer of complexity, namely, the decision of when to communicate. This results in a fundamental trade-off between estimation accuracy and communication resource usage. Traditional extensions of classical estimation algorithms (e.g., the Kalman filter) treat the absence of communication as 'missing' information. However, silence itself can carry implicit information about the system's state, which, if properly interpreted, can enhance the estimation quality even in the absence of explicit communication. Leveraging this implicit structure, however, poses significant analytical challenges, even in relatively simple systems. In this paper, we propose CALM (Communication-Aware Learning and Monitoring), a novel learning-based framework that jointly addresses the dual challenges of communication scheduling and estimator design. Our approach entails learning not only when to communicate but also how to infer useful information from periods of communication silence. We perform comparative case studies on multiple benchmarks to demonstrate that CALM is able to decode the implicit coordination between the estimator and the scheduler to extract information from the instances of 'silence' and enhance the estimation accuracy.
- North America > United States > Illinois > Champaign County > Urbana (0.04)
- North America > United States > North Carolina (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Germany > North Rhine-Westphalia > Upper Bavaria > Munich (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.46)
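The CALM abstract's central idea, that silence itself carries information, can be illustrated with a deliberately simplified, one-sided event trigger that is not CALM's actual scheduler or estimator: if the sensor only transmits values above a threshold, a silent slot tells the estimator that the state lies below that threshold, and conditioning on this fact (a truncated-Gaussian mean here) tends to beat treating the slot as missing data. All constants and the trigger rule below are invented for illustration.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Illustrative sketch (not CALM itself): a scalar AR(1) state is transmitted
# only when it exceeds THRESH. On "silence" the remote estimator knows
# x_t <= THRESH and replaces the open-loop prediction with the mean of the
# prior truncated at THRESH. All constants are made up.
A, SIGMA_W, THRESH, T = 0.9, 1.0, 1.0, 5000

def truncated_mean(mu, sigma, upper):
    """Mean of N(mu, sigma^2) conditioned on being <= upper."""
    alpha = (upper - mu) / sigma
    return mu - sigma * norm.pdf(alpha) / max(norm.cdf(alpha), 1e-12)

x = 0.0
xh_naive, xh_aware = 0.0, 0.0
sigma_pred = SIGMA_W        # crude fixed prediction spread for this sketch
se_naive, se_aware = 0.0, 0.0

for _ in range(T):
    x = A * x + rng.normal(0.0, SIGMA_W)
    if x > THRESH:                      # scheduler decides to communicate
        xh_naive = xh_aware = x
    else:                               # silence implies x <= THRESH
        xh_naive = A * xh_naive                                   # treat as missing
        xh_aware = truncated_mean(A * xh_aware, sigma_pred, THRESH)  # use the silence
    se_naive += (x - xh_naive) ** 2
    se_aware += (x - xh_aware) ** 2

print(f"MSE, silence treated as missing:      {se_naive / T:.3f}")
print(f"MSE, silence treated as x <= THRESH:  {se_aware / T:.3f}")
```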
Finite-Time Minimax Bounds and an Optimal Lyapunov Policy in Queueing Control
Liu, Yujie, Tan, Vincent Y. F., Xu, Yunbei
We introduce an original minimax framework for finite-time performance analysis in queueing control and propose a surprisingly simple Lyapunov-based scheduling policy with superior finite-time performance. The framework quantitatively characterizes how the expected total queue length scales with key system parameters, including the capacity of the scheduling set and the variability of arrivals and departures across queues. To our knowledge, this provides the first firm foundation for evaluating and comparing scheduling policies in the finite-time regime, including nonstationary settings, and shows that the proposed policy can provably and empirically outperform classical MaxWeight in finite time. Within this framework, we establish three main sets of results. First, we derive minimax lower bounds on the expected total queue length for parallel-queue scheduling via a novel Brownian coupling argument. Second, we propose a new policy, LyapOpt, which minimizes the full quadratic Lyapunov drift, capturing both first- and second-order terms, and achieves optimal finite-time performance in heavy traffic while retaining classical stability guarantees. Third, we identify a key limitation of the classical MaxWeight policy, which optimizes only the first-order drift: its finite-time performance depends suboptimally on system parameters, leading to substantially larger backlogs in explicitly characterized settings. Together, these results delineate the scope and limitations of classical drift-based scheduling and motivate new queueing-control methods with rigorous finite-time guarantees.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > Kansas > Russell County (0.04)
- Asia > Singapore (0.04)
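As background for the first- versus second-order drift distinction drawn in the LyapOpt abstract, the following is a schematic derivation under standard single-hop queue dynamics; it is textbook material, not the paper's exact model or notation.

```latex
% Quadratic Lyapunov function and its one-step drift (schematic, single-hop).
\[
  V(q) \;=\; \sum_i q_i^2, \qquad
  q_i(t+1) \;=\; \bigl[\,q_i(t) + a_i(t) - s_i(t)\,\bigr]^+ .
\]
% Expanding and bounding the projection at zero separates two drift terms:
\[
  V\bigl(q(t+1)\bigr) - V\bigl(q(t)\bigr)
  \;\le\; \underbrace{2\sum_i q_i(t)\,\bigl(a_i(t) - s_i(t)\bigr)}_{\text{first-order drift}}
  \;+\; \underbrace{\sum_i \bigl(a_i(t) - s_i(t)\bigr)^2}_{\text{second-order drift}} .
\]
% MaxWeight picks the service vector s(t) in the scheduling set to maximize only
% the first-order term, \max_{s} \sum_i q_i(t) s_i, whereas a policy in the
% spirit of LyapOpt would minimize the full right-hand side, retaining the
% second-order term that matters in finite time.
```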
Mixture-of-Schedulers: An Adaptive Scheduling Agent as a Learned Router for Expert Policies
Wang, Xinbo, Jia, Shian, Huang, Ziyang, Cao, Jing, Song, Mingli
Modern operating system schedulers employ a single, static policy, which struggles to deliver optimal performance across the diverse and dynamic workloads of contemporary systems. This "one-policy-fits-all" approach leads to significant compromises in fairness, throughput, and latency, particularly with the rise of heterogeneous hardware and varied application architectures. This paper proposes a new paradigm: dynamically selecting the optimal policy from a portfolio of specialized schedulers rather than designing a single, monolithic one. We present the Adaptive Scheduling Agent (ASA), a lightweight framework that intelligently matches workloads to the most suitable "expert" scheduling policy at runtime. ASA's core is a novel, low-overhead offline/online approach. First, an offline process trains a universal, hardware-agnostic machine learning model to recognize abstract workload patterns from system behaviors. Second, at runtime, ASA continually processes the model's predictions using a time-weighted probability voting algorithm to identify the workload, then makes a scheduling decision by consulting a pre-configured, machine-specific mapping table to switch to the optimal scheduler via Linux's sched_ext framework. This decoupled architecture allows ASA to adapt to new hardware platforms rapidly without expensive retraining of the core recognition model. Our evaluation, based on a novel benchmark focused on user-experience metrics, demonstrates that ASA consistently outperforms the default Linux scheduler (EEVDF), achieving superior results in 86.4% of test scenarios. Furthermore, ASA's selections are near-optimal, ranking among the top three schedulers in 78.6% of all scenarios. This validates our approach as a practical path toward more intelligent, adaptive, and responsive operating system schedulers.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Europe > Austria > Vienna (0.14)
- Asia > China > Zhejiang Province > Hangzhou (0.04)
- (8 more...)
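A minimal sketch of what ASA's time-weighted probability voting step could look like, assuming the recognition model emits a probability vector over workload classes at each sampling tick. The decay constant, workload class names, and the machine-specific mapping table below are hypothetical, not ASA's actual configuration; the sched_ext scheduler names are real projects but their pairing here is invented.

```python
from collections import defaultdict

# Hypothetical workload classes and a machine-specific mapping from the
# recognized workload to the preferred sched_ext scheduler.
CLASSES = ["latency_sensitive", "batch_throughput", "mixed"]
SCHEDULER_MAP = {
    "latency_sensitive": "scx_lavd",
    "batch_throughput":  "scx_rusty",
    "mixed":             "scx_bpfland",
}
HALF_LIFE_S = 10.0  # older predictions count exponentially less

def vote(predictions, now):
    """predictions: list of (timestamp, {class: probability}) pairs."""
    scores = defaultdict(float)
    for ts, probs in predictions:
        weight = 0.5 ** ((now - ts) / HALF_LIFE_S)   # time-decayed weight
        for cls, p in probs.items():
            scores[cls] += weight * p
    workload = max(scores, key=scores.get)
    return workload, SCHEDULER_MAP[workload]

# Example: recent samples lean towards latency-sensitive behaviour.
history = [
    (0.0,  {"batch_throughput": 0.7, "latency_sensitive": 0.2, "mixed": 0.1}),
    (8.0,  {"latency_sensitive": 0.6, "batch_throughput": 0.3, "mixed": 0.1}),
    (12.0, {"latency_sensitive": 0.8, "batch_throughput": 0.1, "mixed": 0.1}),
]
print(vote(history, now=14.0))
```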
AI for Distributed Systems Design: Scalable Cloud Optimization Through Repeated LLMs Sampling And Simulators
We explore AI-driven distributed-systems policy design by combining stochastic code generation from large language models (LLMs) with deterministic verification in a domain-specific simulator. Using a Function-as-a-Service runtime (Bauplan) and its open-source simulator (Eudoxia) as a case study, we frame scheduler design as an iterative generate-and-verify loop: an LLM proposes a Python policy, the simulator evaluates it on standardized traces, and structured feedback steers subsequent generations. This setup preserves interpretability while enabling targeted search over a large design space. We detail the system architecture and report preliminary results on throughput improvements across multiple models. Beyond early gains, we discuss the limits of the current setup and outline next steps; in particular, we conjecture that AI will be crucial for scaling this methodology by helping to bootstrap new simulators.
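The generate-and-verify loop described above can be sketched as follows. Both `propose_policy` (the LLM call) and `run_simulation` (the Eudoxia-style evaluation) are stand-ins, since the actual APIs of the paper's setup are not reproduced here; only the loop structure (propose, verify on traces, feed structured feedback back) is meant to carry over.

```python
# Iterative generate-and-verify loop for scheduler policy search (sketch).
# `propose_policy` and `run_simulation` are hypothetical stubs for an LLM
# client and a domain-specific simulator; neither is a real API.

def propose_policy(feedback: str) -> str:
    """Ask an LLM for Python source of a scheduling policy (stubbed)."""
    return "def schedule(pending, workers):\n    return sorted(pending)[:len(workers)]"

def run_simulation(policy_source: str) -> dict:
    """Evaluate the candidate policy on standardized traces (stubbed)."""
    namespace = {}
    exec(policy_source, namespace)            # compile the generated policy
    policy = namespace["schedule"]
    throughput = len(policy(list(range(10)), ["w0", "w1", "w2"]))
    return {"throughput": throughput, "errors": []}

best, best_score, feedback = None, float("-inf"), "cold start"
for iteration in range(5):
    candidate = propose_policy(feedback)
    result = run_simulation(candidate)
    if result["errors"]:
        feedback = f"candidate failed: {result['errors']}"   # steer the next generation
        continue
    if result["throughput"] > best_score:
        best, best_score = candidate, result["throughput"]
    feedback = f"best throughput so far: {best_score}"

print(best_score)
print(best)
```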
HPC Digital Twins for Evaluating Scheduling Policies, Incentive Structures and their Impact on Power and Cooling
Maiterth, Matthias, Brewer, Wesley H., Kuruvella, Jaya S., Dey, Arunavo, Islam, Tanzima Z., Menear, Kevin, Duplyakin, Dmitry, Kabir, Rashadul, Patki, Tapasya, Jones, Terry, Wang, Feiyi
Schedulers are critical for optimal resource utilization in high-performance computing. Traditional methods to evaluate schedulers are limited to post-deployment analysis or to simulators that do not model the associated infrastructure. In this work, we present the first-of-its-kind integration of scheduling and digital twins in HPC. This enables what-if studies to understand the impact of parameter configurations and scheduling decisions on the physical assets, even before deployment, or regarding changes not easily realizable in production. We (1) provide the first digital twin framework extended with scheduling capabilities, (2) integrate various top-tier HPC systems given their publicly available datasets, and (3) implement extensions to integrate external scheduling simulators. Finally, we show how to (4) implement and evaluate incentive structures, as well as (5) evaluate machine learning based scheduling, in such a novel digital-twin-based meta-framework for prototyping scheduling. Our work enables what-if scenarios of HPC systems to evaluate sustainability and the impact of scheduling decisions on the simulated system.
- North America > United States > Tennessee > Anderson County > Oak Ridge (0.50)
- North America > United States > Florida > Orange County > Orlando (0.14)
- North America > United States > Texas > Hays County > San Marcos (0.04)
- (16 more...)
- Energy (1.00)
- Government > Regional Government > North America Government > United States Government (0.64)
GATES: Cost-aware Dynamic Workflow Scheduling via Graph Attention Networks and Evolution Strategy
Shen, Ya, Chen, Gang, Ma, Hui, Zhang, Mengjie
Cost-aware Dynamic Workflow Scheduling (CADWS) is a key challenge in cloud computing, focusing on devising an effective scheduling policy to efficiently schedule dynamically arriving workflow tasks, represented as Directed Acyclic Graphs (DAGs), to suitable virtual machines (VMs). Deep reinforcement learning (DRL) has been widely employed for automated scheduling policy design. However, the performance of DRL is heavily influenced by the design of the problem-tailored policy network and is highly sensitive to hyperparameters and the design of reward feedback. Considering the above-mentioned issues, this study proposes a novel DRL method combining a Graph Attention Network-based policy network and an Evolution Strategy, referred to as GATES. The contributions of GATES are summarized as follows: (1) GATES can capture the impact of current task scheduling on subsequent tasks by learning the topological relationships between tasks in a DAG. (2) GATES can assess the importance of each VM to the ready task, enabling it to adapt to dynamically changing VM resources. (3) Utilizing the Evolution Strategy's robustness, exploratory nature, and tolerance for delayed rewards, GATES achieves stable policy learning in CADWS. Extensive experimental results demonstrate the superiority of the proposed GATES in CADWS, outperforming several state-of-the-art algorithms. The source code is available at: https://github.com/YaShen998/GATES.
- Workflow (1.00)
- Research Report > New Finding (0.66)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
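As a reference point for the Evolution Strategy component in GATES, the following is a generic (vanilla) ES parameter update on a toy objective. It illustrates the gradient-free, delayed-reward-tolerant search the abstract relies on and is not code from the GATES repository linked above; in GATES the parameter vector would belong to the GAT-based policy network and the return would come from scheduling episodes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Vanilla Evolution Strategy: estimate a search gradient from perturbed copies
# of the policy parameters and the episodic returns they achieve.
DIM, POP, SIGMA, LR, GENERATIONS = 20, 50, 0.1, 0.02, 200

def episode_return(theta):
    """Toy stand-in for one scheduling episode's (possibly delayed) return."""
    return -np.sum((theta - 0.5) ** 2)

theta = np.zeros(DIM)
for gen in range(GENERATIONS):
    noise = rng.standard_normal((POP, DIM))
    returns = np.array([episode_return(theta + SIGMA * n) for n in noise])
    advantages = (returns - returns.mean()) / (returns.std() + 1e-8)
    # ES search-gradient step: weight each perturbation by its normalized return
    theta += LR / (POP * SIGMA) * noise.T @ advantages

print("final return:", episode_return(theta))
```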
An efficient deep reinforcement learning environment for flexible job-shop scheduling
Wu, Xinquan, Yan, Xuefeng, Wei, Mingqiang, Guan, Donghai
The Flexible Job-shop Scheduling Problem (FJSP) is a classical combinatorial optimization problem with a wide range of applications in the real world. In order to generate fast and accurate scheduling solutions for FJSP, various deep reinforcement learning (DRL) scheduling methods have been developed. However, these methods are mainly focused on the design of the DRL scheduling agent, overlooking the modeling of the DRL environment. This paper presents a simple chronological DRL environment for FJSP based on discrete-event simulation, and an end-to-end DRL scheduling model is proposed based on proximal policy optimization (PPO). Furthermore, a compact novel state representation of FJSP is proposed based on two state variables in the scheduling environment, and a novel, comprehensible reward function is designed based on the scheduling area of machines. Experimental results on public benchmark instances show that the performance of simple priority dispatching rules (PDRs) is improved in our scheduling environment and that our DRL scheduling model obtains competitive performance compared with OR-Tools, meta-heuristic, DRL, and PDR scheduling methods.
- Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
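A minimal sketch of the chronological, discrete-event environment interface such a setup implies, in the usual reset/step style expected by PPO implementations. The two-element state, the reward, and the job data below are placeholders rather than the paper's actual definitions (the paper's reward is based on the scheduling area of machines).

```python
import heapq
import random

# Minimal chronological FJSP-style environment skeleton (sketch).
# Events are machine-release times processed in time order; the agent picks
# which ready operation to dispatch to the freed machine.
class FJSPEnv:
    def __init__(self, num_machines=3, num_ops=20, seed=0):
        self.num_machines, self.num_ops = num_machines, num_ops
        self.rng = random.Random(seed)

    def reset(self):
        self.clock = 0.0
        self.pending = [self.rng.uniform(1, 5) for _ in range(self.num_ops)]
        # event queue of (release_time, machine_id); all machines free at t=0
        self.events = [(0.0, m) for m in range(self.num_machines)]
        heapq.heapify(self.events)
        return self._state()

    def _state(self):
        # placeholder two-variable state: remaining operations, current clock
        return (len(self.pending), self.clock)

    def step(self, action):
        self.clock, machine = heapq.heappop(self.events)
        duration = self.pending.pop(action % len(self.pending))
        heapq.heappush(self.events, (self.clock + duration, machine))
        done = not self.pending
        reward = -duration   # placeholder reward (area-based in the paper)
        return self._state(), reward, done, {}

env = FJSPEnv()
state, done = env.reset(), False
while not done:
    state, reward, done, _ = env.step(0)   # trivial dispatching rule
print("makespan proxy:", state[1])
```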
Multimodal Remote Inference
Zhang, Keyuan, Sun, Yin, Ji, Bo
We consider a remote inference system with multiple modalities, where a multimodal machine learning (ML) model performs real-time inference using features collected from remote sensors. When sensor observations evolve dynamically over time, fresh features are critical for inference tasks. However, timely delivery of features from all modalities is often infeasible because of limited network resources. Towards this end, in this paper, we study a two-modality scheduling problem that seeks to minimize the ML model's inference error, expressed as a penalty function of the Age of Information (AoI) vector of the two modalities. We develop an index-based threshold policy and prove its optimality. Specifically, the scheduler switches to the other modality once the current modality's index function exceeds a predetermined threshold. We show that both modalities share the same threshold and that the index functions and the threshold can be computed efficiently. Our optimality results hold for general AoI functions (which could be non-monotonic and non-separable) and heterogeneous transmission times across modalities. To demonstrate the importance of considering a task-oriented AoI function, we conduct numerical experiments based on robot state prediction and compare our policy with round-robin and uniform random policies (both are oblivious to the AoI and the inference error). The results show that our policy reduces inference error by up to 55% compared with these baselines.
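The switching rule described above can be sketched generically as follows. The index function, transmission-time handling, and AoI dynamics below are crude placeholders (the paper derives the index from the AoI penalty and per-modality transmission times), so only the threshold-crossing structure is meant to carry over.

```python
# Index-threshold switching between two modalities (structural sketch only).

TRANSMIT_TIME = [2, 3]      # heterogeneous (made-up) transmission times
THRESHOLD = 6.0             # shared threshold for both modalities (per the paper)

def index_fn(aoi_self, aoi_other):
    """Placeholder index: grows with the currently served modality's AoI."""
    return aoi_self + 0.5 * aoi_other

aoi = [0, 0]                # Age of Information of modalities 0 and 1
current = 0                 # modality currently being transmitted
schedule = []

for t in range(30):
    # ages grow each slot; the served modality is refreshed after its
    # transmission time elapses (crudely modeled as a periodic reset)
    aoi = [a + 1 for a in aoi]
    if t % TRANSMIT_TIME[current] == 0:
        aoi[current] = TRANSMIT_TIME[current]
    # switch once the served modality's index exceeds the shared threshold
    if index_fn(aoi[current], aoi[1 - current]) > THRESHOLD:
        current = 1 - current
    schedule.append(current)

print(schedule)
```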
Tesserae: Scalable Placement Policies for Deep Learning Workloads
Bian, Song, Agarwal, Saurabh, Mahmood, Md. Tareq, Venkataraman, Shivaram
Training deep learning (DL) models has become a dominant workload in data centers, and improving resource utilization is a key goal of DL cluster schedulers. To do this, schedulers typically incorporate placement policies that govern where jobs are placed on the cluster. Existing placement policies are either designed as ad-hoc heuristics or incorporated as constraints within a complex optimization problem, and thus suffer from either suboptimal performance or poor scalability. Our key insight is that many placement constraints can be formulated as graph matching problems, and based on that we design novel placement policies for minimizing job migration overheads and for job packing. We integrate these policies into Tesserae and describe how our design leads to a scalable and effective GPU cluster scheduler. Our experimental results show that Tesserae improves average job completion time (JCT) by up to 1.62x and makespan by up to 1.15x compared with existing schedulers.
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
- Asia > Middle East > Saudi Arabia > Asir Province > Abha (0.04)
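To show why placement constraints map naturally onto graph matching, here is a small assignment-problem sketch using SciPy's Hungarian solver. The migration-cost matrix and the job/GPU names are invented; this illustrates the general matching formulation the Tesserae abstract appeals to, not its actual policies.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Placement-as-matching sketch: assign jobs to GPU slots so that total
# migration cost is minimized. Costs are invented: keeping a job where it
# already runs is free, moving it elsewhere costs its checkpoint size.
jobs = ["resnet", "bert", "gpt-small"]
slots = ["node0/gpu0", "node0/gpu1", "node1/gpu0"]

# cost[i][j] = cost of placing job i on slot j (0 means "already there")
cost = np.array([
    [0.0, 4.0, 4.0],   # resnet currently on node0/gpu0
    [3.0, 0.0, 3.0],   # bert currently on node0/gpu1
    [7.0, 7.0, 0.0],   # gpt-small currently on node1/gpu0
])

row, col = linear_sum_assignment(cost)   # Hungarian algorithm
for i, j in zip(row, col):
    print(f"{jobs[i]} -> {slots[j]} (cost {cost[i, j]})")
print("total migration cost:", cost[row, col].sum())
```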